Abstract

Common Wireless LAN (WLAN) pathologies include 
low signal-to-noise ratio, congestion, hidden terminals or 
interference from non-802.11 devices and phenomena. 
Prior work has focused on the detection and diagnosis 
of such problems using layer-2 information from 802.11
devices and special-purpose access points and monitors, 
which may not be generally available. Here, we investi- 
gate a user-level approach: is it possible to detect and di- 
agnose 802.11 pathologies with strictly user-level active 
probing, without any cooperation from, and without any 
visibility in, layer-2 devices? In this paper, we present 
preliminary but promising results indicating that such di- 
agnostics are feasible. 

1 Introduction 

Most home networks today use an 802.11 Wireless LAN
(WLAN) with a single Access Point (AP), typically 
operating in Distributed Coordination Function (DCF) 
mode. Home WLANs often suffer from various perfor- 
mance pathologies, such as low signal strength, signif- 
icant noise, interference from external non-802.11 de- 
vices and physical phenomena, various forms of fading, 
hidden terminals from devices in the same WLAN or in 
nearby WLANs, or congestion. These pathologies can 
result in throughput degradation, significant jitter and 
packet losses. To make things worse, due to the wireless 
nature of the medium, troubleshooting WLAN performance
is hard even for experts, let alone home users.

User-level probing is a well-established research area 
in wired networks and it is used in practice to infer var- 
ious properties and problems in such networks. In the 
wireless domain, on the other hand, it is still unclear 
whether user-level probing can be nearly as effective. A 
main motivation behind this work is to answer the following
"intellectual curiosity" question: is it possible to
diagnose common WLAN performance problems using
active probing, without any information from, or modifi- 
cations to, the 802.11 devices or AP? The methods pre- 
sented in this paper show promising (but preliminary) re- 
sults, potentially opening a new research thread within 
the area of wireless networks. 

Specifically, our objective is to construct a user-level 
tool for any 802.11 DCF WLAN that can detect: a) low
Signal-to-Noise Ratio (SNR), b) Hidden Terminals (HT), 
or c) congestion. The methodology we propose, referred 
to as WLAN-probe, is a simple, easy-to-use, client-server 
application that eliminates the need for vendor-specific 
network card (NIC), driver, AP, monitoring devices, or 
network modifications. It is also portable across plat- 
forms, since it only requires a user-level socket library 
(e.g., Berkeley sockets, POSIX, or Winsock APIs). 

There are several reasons for a user-level probing tool: 

Usability: We want to build a diagnostic tool that would 
not require the user to install a specific NIC, AP, or 
modify the kernel (moreover, it would not require 
administrative privileges). The user would just run 
a single instance of the WLAN-probe client at the 
wireless link that appears problematic.

Hardware-agnostic: Most wireless cards today export 
some form of signal strength; for example, the Re- 
ceived Signal Strength Indicator (RSSI). RSSI im- 
plementations are vendor-specific and they are not 
uniform across NICs. A user-level approach avoids 
the need to calibrate NIC statistics across devices 
and drivers on different OSes. 

Software-agnostic: A user-level approach elimi- 
nates the need to write and maintain a hardware- 
compatibility layer for different OSes that would 
expose NIC statistics at user-space. An example 
of that approach is WRAPI [1], designed to work
on Windows XP with NICs supporting NDIS 5.1 
drivers.

Passive inference: Understanding active probing in
the wireless domain may also enable methods for 
passive inference. For example, is it possible to 
troubleshoot client performance at a remote web or 
video server using strictly application traffic? 

State-of-the-art diagnosis tools require (or modify) 
vendor-specific drivers and NICs, or they require spe- 
cial monitors at the home network; we cover these ap- 
proaches in the related work section. 

WLAN-probe is based on two fundamental effects: a) 
the fact that low-SNR and HTs cause a dependency be- 
tween packet size and retransmission probability, while 
congestion does not do so, and b) the fact that low-SNR 
conditions differ significantly from HTs in the delay or 
loss temporal correlations they create. However, mea- 
suring layer-2 retransmissions and delays is not feasible 
without information from the link layer. In this short pa- 
per, we present the basic ideas and algorithms for user- 
level inference of link layer effects, with a limited testbed 
evaluation. In future work, we will conduct a more ex- 
tensive evaluation, experiment with actual deployment at 
several home networks, and expand the set of diagnosed 
pathologies. 

We consider the following architecture, which is typical
for most home WLANs (see Figure 1). A single
802.11 AP is used to interconnect a number of wireless 
devices; we do not make any assumptions about the exact 
type of the 802.11 devices or AP. We assume that another
computer, used as our WLAN-probe measurement server
S, is connected to the AP through an Ethernet connection.
This is not difficult in practice given that most APs pro- 
vide an Ethernet port, as long as the user has at least two 
computers at home. The key requirement for the server 
S and its connection to the WLAN AP is that it should 
not introduce significant jitter (say, more than 1-3msec).
The server S allows us to probe the WLAN channel with- 
out demanding ping-like replies from the AP and without 
distorting the forward-path measurements with reverse-path
responses. The measurements can be conducted either
from C to S or from S to C to allow diagnosis of both
channel directions; we focus on the former. Note that
some APs or terminals that are not a part of our WLAN 
may be nearby (e.g., in other home networks) creating 
hidden terminals and/or interference, while the user has 
no control over these networks. 

We have conducted all experiments in this paper using 
a testbed that consists of 802.11g Soekris net4826 nodes
with mini-PCI interfaces. The mini-PCI interfaces host 
either an Atheros chipset or an Intel 2915ABG chipset, 
with the MadWiFi and ipw2200 drivers respectively (on 
the Linux 2.6.21 kernel). The MadWiFi driver allows 
us to choose between four rate adaptation modules. We 
disable the optional MadWiFi features referred to as fast 
frames and bursting because they are specific to MadWiFi's
Super-G implementation and they can interfere
with the proposed rate inference process. The testbed is
housed in the College of Computing at Georgia Tech, and
the geography is shown in Figure 2.

Figure 1: System architecture.

Figure 2: Testbed layout.



Related work 

There is significant prior work in the area of WLAN 
monitoring and diagnosis. However, to the extent of 
our knowledge, there is no earlier attempt to diagnose
WLAN problems using exclusively user-level active
probing, without any information from 802.11 devices
and other layer-2 monitors. User-level active probing
has been used to estimate conflict graphs and hidden
terminals, assuming that the involved devices cooperate
in the detection of hidden terminals fS] [18] [19].
Instead, with WLAN-probe, hidden terminals may not 
participate in the detection process (and they may be lo- 
cated in different WLANs). Passive measurements have 
also been used for the construction of conflict graphs 
[16] [m [24]. Earlier systems require multiple 802.11
monitoring devices ^ [9] [11], NIC-specific or driver-level
support for layer-2 information Q [23], and network
configuration data l?). Model-based approaches
use transmission observations from the NIC to predict
interference [14] [16] [21] [22]. Signal processing-based
approaches decode PHY signals to identify the type of
interference ITSll; some commercial spectrum analyzers
ill [5] deploy such monitoring devices at vantage points.
2 Wireless Access Delay

The proposed diagnostics are based on a certain compo- 
nent of a probing packet's One-Way Delay (OWD), re- 
ferred to as wireless access delay or simply access delay. 
Intuitively, this term captures the following delay components
that a packet encounters at an 802.11 link: a) waiting
for the channel to become available, b) a (variable)
backoff window before its transmission, c) the transmis- 
sion delay of potential retransmissions, and d) certain 
constant delays (DIFS, SIFS, transmission of ACKs, etc). 
The access delay does not include the potential queue- 
ing delay at the sender due to the transmission of earlier 
packets, nor the latency of the first transmission
of the packet. The access delay captures important prop- 
erties of the link layer delays which allow us to distin- 
guish between pathologies; further, we can estimate it 
with user-level measurements. 

Before we define the wireless access delay more precisely,
let us group the various components of the OWD
$d_i$ of a packet $i$ from C to S (see Figure 1) into four delay
components. We assume that the link between the AP
and S does not cause queueing delays. First, packet $i$ may
have to wait at the sender NIC's transmission queue for
the successful transmission of packet $i-1$ - this is due
to the FCFS nature of that queue and it does not depend
on the 802.11 protocol. If the time-distance ("gap") between
the arrival of the two packets at the sender's queue
is $g_i$, packet $i$ will have to wait for $w_i$ before it is available
for transmission at the head of that queue, where:

    $w_i = \max\{d_{i-1} - g_i,\ 0\}$    (1)

We can estimate $w_i$ only if packet $i-1$ has not been
lost - otherwise we cannot estimate the access delay
for packet $i$. The second delay component is the first
(and potentially last) transmission delay of packet $i$. In
802.11, packets may be retransmitted several times and
each transmission can be at a different layer-2 rate in
general. The ratio $s_i / r_{i,1}$ represents the first transmission's
delay, where $s_i$ is the size of the packet (including
the 802.11 header and the frame-check sequence) and $r_{i,1}$
is the layer-2 rate of the first transmission; we focus on
the estimation of $r_{i,1}$ in the next section. The third delay
component $c$ includes various constant latencies during
the first transmission of a packet; without going into
the details (which are available in longer descriptions
of the 802.11 standard), these latencies include various
DIFS/SIFS segments and the layer-2 ACK transmission
delay (which is always at the same rate). Finally, there is
a variable delay component $\beta_i$. When the packet is transmitted
only once, $\beta_i$ consists of the waiting time ("busy-wait")
for the 802.11 channel to become available as well
as a random backoff window (uniformly distributed in a
certain number of time slots). If the packet has to be
transmitted more than once, $\beta_i$ also includes all the additional
delays because of subsequent retransmission latencies,
busy-wait, backoff times and constant latencies.
These delay components are illustrated in Figure 4. We
define the wireless access delay $a_i$ as

    $a_i = c + \beta_i$    (2)

and so it can be estimated from the OWD as

    $a_i = d_i - w_i - \frac{s_i}{r_{i,1}}$    (3)

where $w_i$ is derived from Equation 1.
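The computation in Equations 1-3 can be sketched as follows. This is only an illustration under stated assumptions, not the WLAN-probe implementation: the function name and argument layout are hypothetical, lost packets are represented by None, all delays are assumed to be in seconds on a common time base, and the first packet of a train is assumed to find an empty sender queue.

```python
def access_delays(owd, gaps, sizes_bits, rates_bps):
    """Return per-packet access delays a_i = d_i - w_i - s_i / r_{i,1}.

    owd[i]        one-way delay d_i in seconds (None if packet i was lost)
    gaps[i]       gap g_i between packets i-1 and i at the sender queue, in seconds
    sizes_bits[i] frame size s_i in bits
    rates_bps[i]  inferred first-transmission rate r_{i,1} in bits per second
    """
    result = []
    for i, d in enumerate(owd):
        if d is None:
            result.append(None)              # lost packet: no access delay
            continue
        if i == 0:
            w = 0.0                          # assumption: empty queue at train start
        elif owd[i - 1] is None:
            result.append(None)              # w_i cannot be estimated (predecessor lost)
            continue
        else:
            w = max(owd[i - 1] - gaps[i], 0.0)               # Equation (1)
        result.append(d - w - sizes_bits[i] / rates_bps[i])  # Equation (3)
    return result
```

For back-to-back packets the gaps are close to zero, so the waiting time of packet i is essentially the OWD of packet i-1.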

Another way to think about the wireless access delay 
is as follows. Suppose that we compare the OWD of a 
packet that traverses an 802.11 link with the OWD of an
equal-sized packet that goes through a work-conserving
FCFS queue with constant service rate $r$ (e.g., a DSL or
a switched Ethernet port). The OWD of the latter would
include the sender waiting time $w_i$ and the transmission
latency $s_i / r$. In that case the term $a_i$ would only consist
of the queueing delay due to cross traffic that arrived
at the link before packet $i$. In the case of 802.11, the
link is not work-conserving (packets may need to wait 
even if the channel is available), the transmission rate 
can change across packets, and there may be retransmis- 
sions of the same packet. Thus, the wireless access delay 
captures not only the delays due to cross traffic, but also 
all the additional delays due to the idiosyncrasies of the 
wireless channel and the 802.11 protocol. A significant 
increase in the access delay of a packet implies either 
long busy-waiting times due to cross traffic, or problem- 
atic wireless channel conditions due to low SNR, inter- 
ference etc. In the following sections we examine the 
information that can be extracted from either temporal 
correlations in the access delay, or from the dependen- 
cies between access delay and packet size. It should be 
noted that the access delay can have additional applica- 
tions in other wireless network inference problems (such 
as available bandwidth estimation), which we plan to in- 
vestigate in future work. 

Diagnosis tree and probing structure 

Having defined the key metric in the proposed method, 
we now present an overview of the WLAN-probe diagnosis
tree that allows us to distinguish between pathologies
(see Figure 3). We start by analyzing each packet
train separately, and use a novel dispersion-based method
to infer the per-packet layer-2 transmission rate, when
possible (Section 3). Based on the inferred rates, we can
estimate the wireless access delay for each packet. We
then examine whether the access delays increase with
the packet size (Section 4). When this is not the case,
the WLAN pathology is diagnosed as congestion. On
the other hand, when the access delays increase with the
packet size, the observed pathology is due to low SNR
or hidden terminals. We distinguish between these two
pathologies based on temporal correlation properties of
packets that either encountered very large access delays
or that were lost at layer-3 (Section 5).

Figure 3: WLAN-probe decision tree.

To conduct the previous diagnosis tests, we need to 
probe the WLAN channel with multiple packet trains 
and with packets of different sizes. Each train provides a 
unique "sample" - we need multiple samples to make any 
statistical inference. Each train consists of several back- 
to-back packets of different sizes. The packets have to be 
transmitted back-to-back so that we can use dispersion- 
based rate inference methods, and they have to be of dif- 
ferent sizes so that we can examine the presence of an 
increasing trend between access delay and size. Specif- 
ically, the probing phase consists of 100 back-to-back 
UDP packet trains. These packet trains are sent from 
the WLAN-probe client C to the WLAN-probe server 
S. The packets are timestamped at C and S so that we 
can measure their relative One-Way Delay (OWD) variations.
The two hosts do not need to have synchronized
clocks, and we compensate for clock skew during each 
train by subtracting the minimum OWD in that train. 
The send/receive timestamps are obtained at user-level. 
There is an idle time of one second between successive 
packet trains. Each train consists of 50 packets of different
sizes. About 10% of the packets, randomly chosen,
are of the minimum-possible size (8 bytes for a sequence
number and a send-timestamp, together with the
UDP/IP headers) and they are referred to as tiny-probes
- they play a special role in transmission rate inference
(see Section 3). The size of the remaining packets is uniformly
selected from the set of values {8 + 200 × k, k =
1...7} bytes.
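The train structure and the relative OWD computation described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that exactly 10% of the packets (rather than "about 10%") are tiny-probes and that sizes are drawn independently for each packet.

```python
import random

TINY_SIZE = 8                                        # bytes: sequence number + send timestamp
PROBE_SIZES = [8 + 200 * k for k in range(1, 8)]     # 208, 408, ..., 1408 bytes

def make_train(n_packets=50, tiny_fraction=0.10, rng=None):
    """Return the list of packet sizes (bytes) for one back-to-back train."""
    rng = rng or random.Random()
    n_tiny = round(n_packets * tiny_fraction)
    tiny_positions = set(rng.sample(range(n_packets), n_tiny))
    return [TINY_SIZE if i in tiny_positions else rng.choice(PROBE_SIZES)
            for i in range(n_packets)]

def relative_owd(send_ts, recv_ts):
    """Subtract the minimum OWD of the train so unsynchronized clocks cancel out."""
    raw = [r - s for s, r in zip(send_ts, recv_ts)]
    base = min(raw)
    return [d - base for d in raw]
```

Together with the tiny-probe size, this yields the eight distinct packet sizes that the later trend test relies on.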



3 Transmission Rate Inference 

The computation of the wireless access delay requires 
the estimation of the rate $r_{i,1}$ for the first transmission
of each probing packet. Even though capacity estimation
using packet-pair dispersion techniques in wired networks
has been studied extensively iTTOllTSl, the accuracy
of those methods in the wireless context has been repeatedly
questioned [20]. There are three reasons that capacity
estimation is much harder in the wireless context and
in 802.11 WLANs in particular. First, different packets 
can be transmitted at different rates (i.e., time-varying 
capacity). Second, the channel is not work-conserving, 
i.e., there may be idle times even though one or more 
terminals have packets to send. Third, potential layer-2 
retransmissions increase the dispersion between packet 
pairs, leading to underestimation errors. On the other 
hand, there are two positive factors in the problem of 
802.11 transmission rate inference. First, there are only 
a few standardized transmission rates, and so instead of
estimating an arbitrary value we can select one out of eight
possible rates. Second, most (but not all) 802.11 rate 
adaptation modules show strong temporal correlations in 
the transmission rate of back-to-back packets. In the fol- 
lowing, we propose a transmission rate inference method 
for 802.11 WLANs. Even though the basic idea of the 
method is based on packet-pair probing, the method is 
novel because it addresses the previous three challenges, 
exploiting these two positive factors. 

Approach: Recall that WLAN-probe sends many 
packet trains from C to S, and each train consists of 
50 back-to-back probing packets (i.e., 49 packet-pairs). 
Consider the packets $i-1$ and $i$ of a certain train; we aim
to estimate the rate $r_{i,1}$ for the first transmission of packet
$i$ given the "dispersion" (or interarrival) $\Delta_i$ between the
two packets at the receiver S. Of course this is possible
only when neither of these two packets is lost (at layer-3).
Further, we require that packet $i$ is not a "tiny-probe".

Let us first assume that packet $i$ was transmitted only
once. In the case of 802.11, and under the assumption
of no retransmissions for packet $i$, the dispersion can be
written as:

    $\Delta_i = \frac{s_i}{r_{i,1}} + c + \beta_i$    (4)

using the notation of the previous section. To estimate
$r_{i,1}$, we first need to subtract from $\Delta_i$ the constant latency
term $c$ and the variable delay term $\beta_i$, which captures the
waiting time for the channel to become available and a
uniformly random backoff period. The sum of these two
terms $c + \beta_i$ is estimated using the tiny-probes; recall that
their IP-layer size is only 8 bytes and so their transmis- 
sion latency is small compared to the transmission la- 
tency for the rest of the probing packets. On the other 
hand, the tiny-probes still experience the same constant
latency $c$ as larger packets, and their variable delay $\beta$
follows the same distribution as that of larger probing
packets (because the channel waiting time, or the
backoff time, do not depend on the size of the transmitted
packet). So, considering only those packet-pairs in
which the second packet is a tiny-probe, we measure the
median dispersion $\tilde{\Delta}_{tiny}$. This median is used as a rough
estimate of the sum $c + \beta_i$, when packet $i$ is not a tiny-probe
(see footnote 1). We then estimate the transmission rate $r_{i,1}$ as:

    $\hat{r}_{i,1} = \frac{s_i}{\Delta_i - \tilde{\Delta}_{tiny}}$    (5)

Figure 4: Timeline of an 802.11 packet transmission showing access delays.
If the $i$'th packet was retransmitted one or more times,
the dispersion $\Delta_i$ will be larger than $s_i / r_{i,1} + c + \beta_i$ and
the rate will be underestimated. A first check is to examine
whether the estimated $\hat{r}_{i,1}$ is significantly smaller than
the lowest possible 802.11 transmission rate (1Mbps). In
that case, we reject the estimate $\hat{r}_{i,1}$ and flag that packet.
Of course it is possible that some remaining packets
have been retransmitted, but without being flagged at this
point. We also flag all tiny-probes, as well as any packet
$i$ if packet $i-1$ was lost.

The next step is to map each remaining estimate $\hat{r}_{i,1}$
to the nearest standardized 802.11 transmission rate $\tilde{r}_{i,1}$.
For instance, if $\hat{r}_{i,1}$ = 10.5Mbps, the nearest 802.11 rate is
11Mbps (note that this transmission rate applies to the
802.11 frame and so $s_i$ has to include the layer-2 headers).
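The per-packet steps above (subtract the median tiny-probe dispersion, reject implausibly low rates, snap to the nearest standardized rate) can be sketched as follows. Several details are assumptions made only for illustration: the candidate rate list is the 802.11b/g set, "significantly smaller than 1Mbps" is taken to mean below half of the lowest rate, a dispersion of None marks a pair in which a packet was lost, and a packet whose dispersion does not exceed the tiny-probe median is flagged as well.

```python
from statistics import median

# Assumption: 802.11b/g rates in Mbps; the set used by WLAN-probe may differ.
RATES_MBPS = [1, 2, 5.5, 6, 9, 11, 12, 18, 24, 36, 48, 54]

def infer_rates(sizes_bits, dispersions, is_tiny, rates_mbps=RATES_MBPS, reject_factor=0.5):
    """Return one standardized rate (Mbps) per packet, or None for flagged packets."""
    tiny = [d for d, t in zip(dispersions, is_tiny) if t and d is not None]
    if not tiny:
        return [None] * len(sizes_bits)
    overhead = median(tiny)                       # rough estimate of c + beta_i
    out = []
    for s, d, t in zip(sizes_bits, dispersions, is_tiny):
        if t or d is None or d <= overhead:
            out.append(None)                      # tiny-probe, lost pair, or unusable sample
            continue
        raw_mbps = s / (d - overhead) / 1e6       # raw estimate before quantization
        if raw_mbps < reject_factor * min(rates_mbps):
            out.append(None)                      # probably retransmitted: flag it
            continue
        out.append(min(rates_mbps, key=lambda r: abs(r - raw_mbps)))
    return out
```

With a raw estimate of 10.5Mbps the function returns 11, matching the example in the text.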

We also exploit the temporal correlations between the 
transmission rate of successive packets (within the same 
train) to improve the existing estimates and to produce 
an estimate for all flagged packets. We have experi- 
mented with the four rate adaptation modules available in 
the MadWiFi driver used with the Atheros chipset (SampleRate,
AMRR, Onoe and Minstrel). Figure 5 (top)
shows the fraction of probing packets in a train that were
transmitted at the most common transmission rate during
that train, under three different channel conditions.
These results were obtained from 100 experiments with
50-packet trains; we also show the Wilcoxon 95% confidence
interval in each case. Note that all rate adaptation
modules exhibit strong temporal correlations, while three
of them (AMRR, Minstrel and Onoe) seem to use a single
rate for all packets during a train (each train lasts for
5-250msec, depending on the transmission rate).

Footnote 1: This estimate is revised in the last stage of the algorithm, after we
have obtained a first estimate for the transmission rate during a train.
We then estimate the transmission latency of each tiny-probe and subtract
it from its measured dispersion.

Based on the previous strong temporal correlations, 
we compute the mode $\tilde{r}$ (most common value) of the discrete
$\tilde{r}_{i,1}$ estimates. If the mode accounts for less than
30% of the measurements, we reject that packet
train as too noisy. Otherwise, we replace every estimate
$\tilde{r}_{i,1}$, and the estimate for every flagged packet, with $\tilde{r}$. If
most trains show weak modes (i.e., a mode with less than 
30% of the measurements), we abort the diagnosis pro- 
cess because the underlying rate adaptation module does 
not seem to exhibit strong temporal correlations between 
successive packets. In our experiments, this is sometimes 
the case with the SampleRate MadWiFi module. In the 
rest of this work, we only use that rate adaptation mod- 
ule (which is also the default in MadWiFi) because we 
want to examine whether the proposed diagnostics work 
reliably even under considerable rate estimation errors. 
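The mode-based step can be sketched as follows. The function name is hypothetical, flagged packets are represented by None, and the 30% threshold is assumed to be measured against the unflagged estimates of the train.

```python
from collections import Counter

def smooth_train_rates(rates, min_mode_fraction=0.30):
    """Replace all per-packet rates in a train with the train's mode.

    rates: standardized rate per packet, None for flagged packets.
    Returns None when the train is rejected as too noisy.
    """
    known = [r for r in rates if r is not None]
    if not known:
        return None
    mode, count = Counter(known).most_common(1)[0]
    if count / len(known) < min_mode_fraction:
        return None                      # weak mode: reject this train
    return [mode] * len(rates)           # flagged packets inherit the mode too
```

A caller could count how many trains return None and abort the diagnosis when most of them do, as described above.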

Evaluation: Figure 5 (bottom) shows the accuracy of
the proposed rate estimation method under three quite 
different channel conditions. In particular, we show the 
average of the absolute relative error across all probing 
packets for which we know the ground-truth transmis- 
sion rate. The "ground-truth" for each packet was ob- 
tained using an AirPcap monitor, positioned close to the 
sender, that captured most (but not all) probing packets. 
We detect the first transmission for each packet using the
"Retry" flag in the 802.11 header. We see that the inference
error is low in most cases; the SampleRate module
gives a relatively higher error. 

4 Detecting Size-dependent Pathologies

The first "branching point" in the decision tree of Figure 3
is to examine whether access delays increase with
the size of probing packets. Recall that each probing
train consists of packets with eight distinct sizes. The
pathologies in an 802.11 WLAN can be grouped in two
categories: a) pathologies that are more likely to increase
the access delay of larger packets, because of increased
waiting at the sender or increased retransmission likelihood,
and b) pathologies that increase the access delay of
all packets with the same likelihood, independent of size.
We refer to the former as size-dependent pathologies and
the latter as size-independent.

Figure 5: Rate inference: strong temporal correlations
between the transmission rate of packets in the same train
(top) and rate inference accuracy (bottom). Low-SNR conditions
are created by separating C and its AP by several meters;
congestion is caused by a UDP bulk-transfer over a
second network that is in-range.

The first category includes a broad class of problems 
such as bit errors due to noise, fading, interference, low 
transmission signal strength, or hidden terminals. In 
the simplest (but unrealistic) case of independent bit errors,
the probability that a frame of size $s$ bits will be
received with bit errors when the bit-error rate is $p$ is
$1 - (1 - p)^s$, which increases sharply with $s$. Of course,
in practice bit errors are not independent and 802.11 
frame transmissions are partially protected with FEC and 
rate adaptation techniques. We expect however that when 
the previously mentioned pathologies are severe enough 
to cause performance problems, larger packets have a 
higher probability of being retransmitted, causing an in- 
creasing trend between access delay and packet size. 
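As a small worked example of the independent-bit-error formula above, the following sketch compares the smallest and largest non-tiny probe sizes. The bit-error rate of 1e-5 is an arbitrary illustrative value, not a measurement, and layer-2 headers are ignored for simplicity.

```python
def frame_error_probability(size_bytes, bit_error_rate):
    """Probability that a frame has at least one bit error, assuming independent errors."""
    size_bits = 8 * size_bytes
    return 1.0 - (1.0 - bit_error_rate) ** size_bits

if __name__ == "__main__":
    for size in (208, 1408):                         # smallest and largest non-tiny probes
        print(size, frame_error_probability(size, 1e-5))
```

Under these assumptions the 1408-byte probe is several times more likely to be corrupted than the 208-byte probe, which is the kind of size dependence the trend test looks for.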

The size-independent class includes pathologies that 
can also cause large access delays, due to increased wait- 
ing at the sender or retransmissions, but where the mag- 
nitude of the access delay is independent of the packet 
size. The best instance in this class is WLAN conges- 
tion. It is important, however, that the traffic that causes 
congestion is generated by WLAN terminals that can 
"carrier-sense" each other (otherwise we have hidden- 
terminals). In the case of congestion, the access delays 
will be larger than the case when there is no congestion 
(packets have to wait more for the channel to become
available) but the access delays would not depend on the
packet size.

Figure 6: Low signal strength and congestion: effect of
packet size (SampleRate module).

Approach: We distinguish between the two pathology 
classes using statistical trend detection in the relation between
access delay and packet size. Figure 6 shows the
inferred access delays from experiments with 100 packet 
trains. In the first experiment (left), the client C and the 
AP are separated by a large distance of 5-6m, so that C's 
bulk-transfer throughput drops to about 1Mbps. In the 
second experiment, we attempt to saturate the WLAN 
with UDP traffic that originates from another terminal. 
All terminals and APs can carrier-sense each other (we 
test this based on throughput comparisons when one or 
more nodes are active). We use 802.11g channel-6 and
SampleRate in both experiments. 

The access delays in the case of low signal strength 
increase with the packet size, while this is not true in 
the case of congestion. A more thorough analysis of 
these measurements reveals that not all access delays in- 
crease with the packet size, under low signal strength. 
Instead, the increasing trend is clearly observed among 
those packets that have the larger access delays for each 
probing size. This is not surprising: the packets with the 
larger access delays among the set of packets of a cer- 
tain size, are typically those that are retransmitted, and 
the retransmission probability increases with the packet 
size under size-dependent pathologies. For this reason, 
instead of examining the average or the median access
delay for each packet size, we consider the 95th
percentile $a_{95}(s)$ of the access delays for each packet
size $s$.

The trend detection is performed using the nonparametric
Kendall one-sided hypothesis test [12]. The null
hypothesis is that there is no trend in the bivariate sample
$\{s, a_{95}(s)\}$ for $s \in \{8 + k \times 200,\ k = 1 \ldots 7\}$ (bytes),
while the alternate hypothesis is that there is an increasing
trend.
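A one-sided Kendall trend test on the seven (size, 95th percentile) points can be sketched with the standard library alone. The sketch below is an illustration rather than the authors' code: it computes an exact p-value by enumerating all orderings (feasible for seven points), it assumes the percentiles are listed in order of increasing packet size and contain no ties, and it uses a nearest-rank percentile, which is one of several common definitions.

```python
from itertools import permutations

def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = -(-q * len(ordered) // 100)          # ceil(q * n / 100) in integer arithmetic
    return ordered[rank - 1]

def kendall_s(y):
    """Concordant minus discordant pairs, with x taken as the index order."""
    n = len(y)
    return sum((y[j] > y[i]) - (y[j] < y[i])
               for i in range(n) for j in range(i + 1, n))

def increasing_trend_p_value(y):
    """Exact one-sided p-value P(S >= observed) under the no-trend null hypothesis."""
    observed = kendall_s(y)
    hits = total = 0
    for perm in permutations(range(len(y))):
        total += 1
        hits += kendall_s(perm) >= observed
    return hits / total
```

A typical use would compute `percentile(delays_for_size, 95)` for each of the seven sizes and pass the resulting list to `increasing_trend_p_value`; a small p-value then points to a size-dependent pathology.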

Evaluation: For the experiments of Figure 6, the test
strongly rejects the null hypothesis under low signal
strength (the p-value is less than
0.01 across all MadWiFi rate modules), while the p-value
in the case of congestion is 0.81 (0.7-1.0 across
all MadWiFi rate modules). We have repeated simi- 
lar experiments with all other MadWiFi rate adaptation 
modules and under different signal strengths and conges- 
tion levels. The p-values in all experiments show a clear 
difference between size-dependent and size-independent 
pathologies, as long as the received signal strength is less 
than about 8-10dBm. For higher signal strengths, the
user-level throughput is more than 5Mbps, and so it is 
questionable whether there is a pathology that needs to 
be diagnosed in the first place. 

5 Low SNR and Hidden Terminals 

After the detection of a size-dependent pathology, 
WLAN-probe attempts to distinguish between low-SNR 
conditions and Symmetric Hidden Terminals (SHTs). 
The former represents a wide range of problems (low 
signal strength, interference from non-802.11 devices, 
significant fading, and others) - a common characteris- 
tic is that they are all caused by exogenous factors that 
affect the wireless channel independent of the presence 
of traffic in the channel. SHTs represent the case that 
at least two 802.11 senders (from the same or different 
WLANs) cannot carrier-sense each other and when they
both transmit at the same time neither sender's traffic is
correctly received. SHTs do not represent an exogenous
pathology because the problem disappears if all but one
of the colliding senders back off. The case of asymmetric
HTs (or one-node HTs), where one sender's transmis- 
sions are corrupted while the conflicting sender's trans- 
missions are correctly received, is no different than the 
exogenous factors we consider and WLAN-probe will 
diagnose them as low-SNR. 

Approach: To distinguish between low-SNR and 
SHTs, we first introduce some additional terminology of 
events that probing packets may see. A probing packet 
may be lost at layer-3 (denoted by L3), after a number 
of unsuccessful retransmissions at layer-2. A probing 
packet may see an outlier delay (OD), if its access de- 
lay is significantly higher than the typical access delay 
in that probing experiment - we classify a packet as OD 
if its access delay is larger than the sample median plus 
three standard deviations (the sample includes all mea- 
sured access delays in that probing experiment - across 
all trains). Finally, a probing packet may see a large 
delay (LD) if its access delay is higher than the typical 
access delay in that probing experiment - we classify a 
packet as LD if its access delay is higher than the 90th
percentile of the empirical distribution of access delays
(after we have excluded OD packets). Note that the access
delays of OD packets are typically much larger than
the access delays of LD packets.

Figure 7: Probability ratio ($p_c/p_u$) to distinguish between
low-SNR and SHT conditions.
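The OD and LD definitions above can be sketched as a labeling function over all access delays of a probing experiment. The function and label names are hypothetical; the sketch assumes the sample (not population) standard deviation and a nearest-rank 90th percentile, and it leaves layer-3 losses (L3) to be labeled separately by the caller.

```python
from statistics import median, stdev

def classify_delays(access_delays):
    """Label each access delay as 'OD' (outlier), 'LD' (large) or 'OK'."""
    od_threshold = median(access_delays) + 3 * stdev(access_delays)
    rest = sorted(a for a in access_delays if a <= od_threshold)   # OD packets excluded
    rank = -(-90 * len(rest) // 100)            # ceil(0.9 * n) in integer arithmetic
    ld_threshold = rest[rank - 1]               # nearest-rank 90th percentile
    labels = []
    for a in access_delays:
        if a > od_threshold:
            labels.append("OD")
        elif a > ld_threshold:
            labels.append("LD")
        else:
            labels.append("OK")
    return labels
```

Because OD packets are removed before the 90th percentile is computed, a few extreme delays do not inflate the LD threshold.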

The probing and diagnosis process works as follows. 
The probing packets in this WLAN-probe experiment are 
of the largest possible size that will not be fragmented. 
The reason is that larger packets are more likely to collide 
with other transmissions in the case of SHTs. We then 
identify all OD or L3 packets in the probing trains of the 
experiment, and estimate the unconditional probability
$p_u$ that either event takes place:

    $p_u = \mathrm{Prob}[\mathrm{OD} \lor \mathrm{L3}]$    (6)

We then focus on the successor of an OD or L3 event, 
i.e., the probing packet that follows an OD or L3 packet. 
Under low-SNR scenarios we expect that the channel 
conditions exhibit strong temporal correlations, and so 
if a packet $i$ experiences an OD or L3 event, its successor
packet $i+1$ (denoted by successor($i$)) will see a large delay
(LD) or layer-3 loss (L3) event with high probability.

On the other hand, if packet $i$ experiences an outlier
delay (OD) or L3 event due to an SHT, the colliding
senders will back off for a random time period and it is
less likely that the successor packet will be LD or L3.
To capture the previous temporal correlations between
an L3 or OD packet and its successor, we consider the
conditional probability:

    $p_c = \mathrm{Prob}[\mathrm{successor}(i) : \mathrm{LD} \lor \mathrm{L3} \mid i : \mathrm{OD} \lor \mathrm{L3}]$    (7)

The detection method focuses on the ratio $p_c / p_u$ of
the previous conditional and unconditional probabilities. 
If there is a strong temporal correlation between a prob- 
ing packet that experiences an OD or L3 event and its 
successor, this probability ratio will be much larger than 
one. We expect this to be the case under low-SNR conditions.
Otherwise, under SHT conditions, the previous
temporal correlation is much weaker and the probability
ratio will be closer to one. 
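The ratio of Equations 6 and 7 can be estimated from per-train event labels as sketched below. The function name and the label strings ('OD', 'L3', 'LD', 'OK') are hypothetical. The sketch reads Equation 7 literally, so a successor labeled OD does not count as LD, and it only pairs packets within the same train; both choices are assumptions.

```python
def probability_ratio(trains):
    """Return p_c / p_u, or None when no OD/L3 event (or no successor) was observed.

    trains: list of per-train label lists, each label one of 'OD', 'L3', 'LD', 'OK'.
    """
    trigger = {"OD", "L3"}          # events counted in p_u and conditioned on in p_c
    follow = {"LD", "L3"}           # successor events counted in p_c
    total = n_trigger = n_pairs = n_follow = 0
    for labels in trains:
        total += len(labels)
        n_trigger += sum(label in trigger for label in labels)
        for current, successor in zip(labels, labels[1:]):
            if current in trigger:
                n_pairs += 1
                n_follow += successor in follow
    if n_trigger == 0 or n_pairs == 0:
        return None
    p_u = n_trigger / total         # Equation (6)
    p_c = n_follow / n_pairs        # Equation (7)
    return p_c / p_u
```

The returned value would then be compared against a threshold, with large ratios suggesting low SNR and ratios near one suggesting symmetric hidden terminals.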

Evaluation: Figure 7 shows the distribution of the
probability ratio $p_c/p_u$ for 100 low-SNR and 90 SHT
experiments in our testbed. We create low-SNR conditions
by reducing the transmission power of the WLAN-probe
client C to 6-10dBm; the access point is about
3m away. We create SHT conditions using two different
networks on 802.11g channel-6, such that the two
senders cannot carrier-sense each other. When only
one sender is active, the throughput in the correspond- 
ing network is higher than 10Mbps - when both senders 
are always backlogged, the throughput drops to less than 
1Mbps. The probability ratio is always less than 5 under 
SHTs, while it is higher than 5 in 80% of the experiments 
under low-SNR conditions. A probability ratio threshold 
between 3 and 5 should be sufficient to diagnose almost all
SHTs accurately. Under low-SNR conditions, however, 
we should expect some diagnosis errors: in 10-20% of 
the cases, WLAN-probe will diagnose a low-SNR con- 
dition as SHT. We are investigating ways to further im- 
prove the accuracy of this diagnostic process. 

6 Conclusions and future work 

We proposed a home WLAN diagnosis process that only 
requires user-level active probing, and presented some 
preliminary but promising results that show the feasi- 
bility of such diagnostics. A design consideration for 
our methods is usability: we do not require adminis- 
trative privileges, any form of support from the wireless 
card/driver/AP, or sensor nodes at vantage points in the
home. 

We are working on several extensions of WLAN-probe.
First, it is possible that there is no real WLAN
pathology - we are working on a method that can dis- 
tinguish between normal operation and the previous 
pathologies. Second, some preliminary work shows that 
we can detect certain non-802.11 interference sources, 
such as microwave ovens. Third, we are working on 
improvements in the rate inference method and on test- 
ing these methods with additional rate adaptation mech- 
anisms. Finally, we will conduct a larger-scale evalua- 
tion of the WLAN-probe diagnostic accuracy with more 
testbed experiments as well as with actual home WLAN 
deployments.